Zone Map Pruning for Metrics #6363

Open
alexanderbianchi wants to merge 1 commit into main from bianchi/zonemap

Collaborator

@alexanderbianchi alexanderbianchi commented Apr 29, 2026

Summary

  • Adds conservative scan-time metadata pruning for metrics splits after the metastore split list is fetched.
  • Extracts string equality and IN predicates from DataFusion filters and evaluates them against split metadata before building Parquet file groups.
  • Prunes with exact metric_name metadata, exact low-cardinality tag metadata, and per-column zonemap_regexes when present.
  • Keeps splits when relevant metadata is missing or a zonemap regex is invalid, so older or partially populated metadata cannot produce false negatives.
  • Marks metric and tag predicates as inexact pushdown so DataFusion passes them into the TableProvider scan while still applying the row-level filter later.
  • Updates the metrics integration test helper to preserve writer-produced zonemap metadata and adds an end-to-end pruning regression test.
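
The conservative pruning rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SplitMeta`, `InPredicate`, `may_match`, `prune`, `set`, and `example_splits` are hypothetical names, the metadata layout is assumed, and zonemap regex evaluation is left out so the example needs only the standard library.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical, simplified view of per-split metadata (not the real type).
struct SplitMeta {
    split_id: &'static str,
    /// Exact set of metric names in the split; `None` if it was never recorded.
    metric_names: Option<BTreeSet<String>>,
    /// Exact value sets for low-cardinality tags; a missing key means "unknown".
    tag_values: HashMap<String, BTreeSet<String>>,
}

/// A string equality (one value) or IN (several values) predicate on a column.
struct InPredicate {
    column: String,
    values: Vec<String>,
}

/// Builds an owned string set from string literals.
fn set(values: &[&str]) -> BTreeSet<String> {
    values.iter().map(|v| v.to_string()).collect()
}

/// Returns true if the split may contain matching rows and must be scanned.
/// Missing metadata always keeps the split, so pruning cannot drop matches.
fn may_match(split: &SplitMeta, pred: &InPredicate) -> bool {
    let known = if pred.column == "metric_name" {
        split.metric_names.as_ref()
    } else {
        split.tag_values.get(&pred.column)
    };
    match known {
        None => true,
        Some(present) => pred.values.iter().any(|v| present.contains(v)),
    }
}

/// Keeps only the splits that no predicate can rule out.
fn prune<'a>(splits: &'a [SplitMeta], preds: &[InPredicate]) -> Vec<&'a SplitMeta> {
    splits
        .iter()
        .filter(|s| preds.iter().all(|p| may_match(s, p)))
        .collect()
}

/// Three sample splits: "a" has env=prod, "b" has env=dev, "c" has no tag metadata.
fn example_splits() -> Vec<SplitMeta> {
    let with_env = |id: &'static str, env: &str| SplitMeta {
        split_id: id,
        metric_names: Some(set(&["cpu"])),
        tag_values: HashMap::from([("env".to_string(), set(&[env]))]),
    };
    vec![
        with_env("a", "prod"),
        with_env("b", "dev"),
        SplitMeta { split_id: "c", metric_names: Some(set(&["cpu"])), tag_values: HashMap::new() },
    ]
}

fn main() {
    let splits = example_splits();
    let preds = vec![InPredicate {
        column: "env".to_string(),
        values: vec!["prod".to_string(), "staging".to_string()],
    }];
    for kept in prune(&splits, &preds) {
        println!("scan split {}", kept.split_id); // prints "a" and "c"
    }
}
```

In this sketch, split `c` survives only because its `env` metadata is missing. That is the reason the row-level filter still has to run afterwards: a kept split is one that could not be ruled out, not one known to match.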

Notes

This intentionally does not push tag predicates into the metastore query yet. Metric name and time range still drive metastore-side pruning; tag and zonemap pruning runs locally on returned split metadata before Parquet files are read.

Testing

  • cargo fmt --package quickwit-datafusion
  • cargo test -p quickwit-datafusion

@alexanderbianchi
Collaborator Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. 🎉

